indian institute
Physics-guided Emulators Reveal Resilience and Fragility under Operational Latencies and Outages
Dubey, Sarth, Ghosh, Subimal, Bhatia, Udit
Reliable hydrologic and flood forecasting requires models that remain stable when input data are delayed, missing, or inconsistent. However, most advances in rainfall-runoff prediction have been evaluated under ideal data conditions, emphasizing accuracy rather than operational resilience. Here, we develop an operationally ready emulator of the Global Flood Awareness System (GloFAS) that couples long-and short-term memory networks with a relaxed water-balance constraint to preserve physical coherence. Five architectures span a continuum of information availability: from complete historical and forecast forcings to scenarios with data latency and outages, allowing systematic evaluation of robustness. Trained in minimally managed catchments across the United States and tested in more than 5,000 basins, including heavily regulated rivers in India, the emulator reproduces the hydrological core of GloFAS and degrades smoothly as information quality declines. The framework establishes operational robustness as a measurable property of hydrological machine learning and advances the design of reliable real-time forecasting systems. Catchment response to precipitation varies in space and time with climate, storage dynamics, and human regulation, making reliable prediction dependent on both data availability and model adaptability [3, 4]. Although advances in observations, reanalysis products, and computational methods have expanded predictive capability [5-9], translating this progress into forecasting systems that operate continuously and robustly in real time remains unresolved. Operational forecasting requires models that sustain accuracy and physical realism when input data are asynchronous, incomplete, or inconsistent with the conditions used for training, and that can do so with limited human intervention [10-12].
- North America > United States (0.66)
- Asia > India > Gujarat > Gandhinagar (0.05)
- Asia > India > Maharashtra > Mumbai (0.04)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Architecture > Real Time Systems (0.88)
Crystal Systems Classification of Phosphate-Based Cathode Materials Using Machine Learning for Lithium-Ion Battery
Yadav, Yogesh, Yadav, Sandeep K, Vijay, Vivek, Dixit, Ambesh
The physical and chemical characteristics of cathodes used in batteries are derived from the lithium-ion phosphate cathodes crystalline arrangement, which is pivotal to the overall battery performance. Therefore, the correct prediction of the crystal system is essential to estimate the properties of cathodes. This study applies machine learning classification algorithms for predicting the crystal systems, namely monoclinic, orthorhombic, and triclinic, related to Li P (Mn, Fe, Co, Ni, V) O based Phosphate cathodes. The data used in this work is extracted from the Materials Project. Feature evaluation showed that cathode properties depend on the crystal structure, and optimized classification strategies lead to better predictability. Ensemble machine learning algorithms such as Random Forest, Extremely Randomized Trees, and Gradient Boosting Machines have demonstrated the best predictive capabilities for crystal systems in the Monte Carlo cross-validation test. Additionally, sequential forward selection (SFS) is performed to identify the most critical features influencing the prediction accuracy for different machine learning models, with Volume, Band gap, and Sites as input features ensemble machine learning algorithms such as Random Forest (80.69%), Extremely Randomized Tree (78.96%), and Gradient Boosting Machine (80.40%) approaches lead to the maximum accuracy towards crystallographic classification with stability and the predicted materials can be the potential cathode materials for lithium ion batteries.
- Energy > Energy Storage (1.00)
- Electrical Industrial Apparatus (1.00)
GroMo: Plant Growth Modeling with Multiview Images
Bhatt, Ruchi, Bansal, Shreya, Chander, Amanpreet, Kaur, Rupinder, Singh, Malya, Kankanhalli, Mohan, Saddik, Abdulmotaleb El, Saini, Mukesh Kumar
Understanding plant growth dynamics is essential for applications in agriculture and plant phenotyping. We present the Growth Modelling (GroMo) challenge, which is designed for two primary tasks: (1) plant age prediction and (2) leaf count estimation, both essential for crop monitoring and precision agriculture. For this challenge, we introduce GroMo25, a dataset with images of four crops: radish, okra, wheat, and mustard. Each crop consists of multiple plants (p1, p2, ..., pn) captured over different days (d1, d2, ..., dm) and categorized into five levels (L1, L2, L3, L4, L5). Each plant is captured from 24 different angles with a 15-degree gap between images. Participants are required to perform both tasks for all four crops with these multiview images. We proposed a Multiview Vision Transformer (MVVT) model for the GroMo challenge and evaluated the crop-wise performance on GroMo25. MVVT reports an average MAE of 7.74 for age prediction and an MAE of 5.52 for leaf count. The GroMo Challenge aims to advance plant phenotyping research by encouraging innovative solutions for tracking and predicting plant growth. The GitHub repository is publicly available at https://github.com/mriglab/GroMo-Plant-Growth-Modeling-with-Multiview-Images.
Everyday Speech in the Indian Subcontinent
Pathak, Utkarsh, Gunda, Chandra Sai Krishna, Sathiyamoorthy, Sujitha, Agarwal, Keshav, Murthy, Hema A.
India has 1369 languages of which 22 are official. About 13 different scripts are used to represent these languages. A Common Label Set (CLS) was developed based on phonetics to address the issue of large vocabulary of units required in the End to End (E2E) framework for multilingual synthesis. This reduced the footprint of the synthesizer and also enabled fast adaptation to new languages which had similar phonotactics, provided language scripts belonged to the same family. In this paper, we provide new insights into speech synthesis, where the script belongs to one family, while the phonotactics comes from another. Indian language text is first converted to CLS, and then a synthesizer that matches the phonotactics of the language is used. Quality akin to that of a native speaker is obtained for Sanskrit and Konkani with zero adaptation data, using Kannada and Marathi synthesizers respectively. Further, this approach also lends itself seamless code switching across 13 Indian languages and English in a given native speaker's voice.
- Asia > India > Tamil Nadu > Chennai (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- (2 more...)
G$^{2}$TR: Generalized Grounded Temporal Reasoning for Robot Instruction Following by Combining Large Pre-trained Models
Arora, Riya, Narendranath, Niveditha, Tambi, Aman, Zachariah, Sandeep S., Chakraborty, Souvik, Paul, Rohan
Consider the scenario where a human cleans a table and a robot observing the scene is instructed with the task "Remove the cloth using which I wiped the table". Instruction following with temporal reasoning requires the robot to identify the relevant past object interaction, ground the object of interest in the present scene, and execute the task according to the human's instruction. Directly grounding utterances referencing past interactions to grounded objects is challenging due to the multi-hop nature of references to past interactions and large space of object groundings in a video stream observing the robot's workspace. Our key insight is to factor the temporal reasoning task as (i) estimating the video interval associated with event reference, (ii) performing spatial reasoning over the interaction frames to infer the intended object (iii) semantically track the object's location till the current scene to enable future robot interactions. Our approach leverages existing large pre-trained models (which possess inherent generalization capabilities) and combines them appropriately for temporal grounding tasks. Evaluation on a video-language corpus acquired with a robot manipulator displaying rich temporal interactions in spatially-complex scenes displays an average accuracy of 70.10%. The dataset, code, and videos are available at https://reail-iitdelhi.github.io/temporalreasoning.github.io/ .
Interview with AAAI Fellow Anima Anandkumar: Neural Operators for science and engineering problems
Each year the Association for the Advancement of Artificial Intelligence (AAAI) recognizes a group of individuals who have made significant, sustained contributions to the field of artificial intelligence by appointing them as Fellows. We've been talking to some of the 2024 AAAI Fellows to find out more about their research. In this interview, we meet Anima Anandkumar and find out about her work on Neural Operators, of which she is the inventor. Neural Operators are able to learn complex physical phenomena that occur at multiple resolutions while standard neural networks are unable to do so. Standard neural networks use a fixed number of pixels or resolution to learn a phenomenon, while neural operators represent data as continuous functions.
- Information Technology (0.51)
- Health & Medicine (0.37)
A Hybrid-Layered System for Image-Guided Navigation and Robot Assisted Spine Surgery
T, Suhail Ansari, Maik, Vivek, Naheem, Minhas, Ram, Keerthi, Lakshmanan, Manojkumar, Sivaprakasam, Mohanasankar
In response to the growing demand for precise and affordable solutions for Image-Guided Spine Surgery (IGSS), this paper presents a comprehensive development of a Robot-Assisted and Navigation-Guided IGSS System. The endeavor involves integrating cutting-edge technologies to attain the required surgical precision and limit user radiation exposure, thereby addressing the limitations of manual surgical methods. We propose an IGSS workflow and system architecture employing a hybrid-layered approach, combining modular and integrated system architectures in distinctive layers to develop an affordable system for seamless integration, scalability, and reconfigurability. We developed and integrated the system and extensively tested it on phantoms and cadavers. The proposed system's accuracy using navigation guidance is 1.020 mm, and robot assistance is 1.11 mm on phantoms. Observing a similar performance in cadaveric validation where 84% of screw placements were grade A, 10% were grade B using navigation guidance, 90% were grade A, and 10% were grade B using robot assistance as per the Gertzbein-Robbins scale, proving its efficacy for an IGSS. The evaluated performance is adequate for an IGSS and at par with the existing systems in literature and those commercially available. The user radiation is lower than in the literature, given that the system requires only an average of 3 C-Arm images per pedicle screw placement and verification
- Asia > India > Tamil Nadu > Chennai (0.04)
- Europe > Italy (0.04)
- Asia > Vietnam > Quảng Ninh Province > Hạ Long (0.04)
- Health & Medicine > Surgery (1.00)
- Health & Medicine > Therapeutic Area > Orthopedics/Orthopedic Surgery (0.72)
- Health & Medicine > Therapeutic Area > Musculoskeletal (0.72)
- Health & Medicine > Therapeutic Area > Neurology (0.46)
MedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineering
Srivastava, Roomani, Prasad, Suraj, Bhat, Lipika, Deshpande, Sarvesh, Das, Barnali, Jadhav, Kshitij
A major roadblock in the seamless digitization of medical records remains the lack of interoperability of existing records. Extracting relevant medical information required for further treatment planning or even research is a time consuming labour intensive task involving expenditure of valuable time of doctors. In this demo paper we present, MedPromptExtract an automated tool using a combination of semi supervised learning, large language models, natural language processing and prompt engineering to convert unstructured medical records to structured data which is amenable for further analysis.
- Asia > India > Maharashtra > Mumbai (0.06)
- North America > United States > Idaho > Ada County > Boise (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Health Care Technology > Medical Record (1.00)
Classification of executive functioning performance post-longitudinal tDCS using functional connectivity and machine learning methods
Rao, Akash K, Menon, Vishnu K, Uttrani, Shashank, Dixit, Ayushman, Verma, Dipanshu, Dutt, Varun
Executive functioning is a cognitive process that enables humans to plan, organize, and regulate their behavior in a goal-directed manner. Understanding and classifying the changes in executive functioning after longitudinal interventions (like transcranial direct current stimulation (tDCS)) has not been explored in the literature. This study employs functional connectivity and machine learning algorithms to classify executive functioning performance post-tDCS. Fifty subjects were divided into experimental and placebo control groups. EEG data was collected while subjects performed an executive functioning task on Day 1. The experimental group received tDCS during task training from Day 2 to Day 8, while the control group received sham tDCS. On Day 10, subjects repeated the tasks specified on Day 1. Different functional connectivity metrics were extracted from EEG data and eventually used for classifying executive functioning performance using different machine learning algorithms. Results revealed that a novel combination of partial directed coherence and multi-layer perceptron (along with recursive feature elimination) resulted in a high classification accuracy of 95.44%. We discuss the implications of our results in developing real-time neurofeedback systems for assessing and enhancing executive functioning performance post-tDCS administration.
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
- Asia > India > Himachal Pradesh (0.04)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Health Care Technology (0.94)
Effect of dimensionality change on the bias of word embeddings
Word embedding methods (WEMs) are extensively used for representing text data. The dimensionality of these embeddings varies across various tasks and implementations. The effect of dimensionality change on the accuracy of the downstream task is a well-explored question. However, how the dimensionality change affects the bias of word embeddings needs to be investigated. Using the English Wikipedia corpus, we study this effect for two static (Word2Vec and fastText) and two context-sensitive (ElMo and BERT) WEMs. We have two observations. First, there is a significant variation in the bias of word embeddings with the dimensionality change. Second, there is no uniformity in how the dimensionality change affects the bias of word embeddings. These factors should be considered while selecting the dimensionality of word embeddings.
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Software (0.95)